173 results found.
Written
Corpus,
Language Type:
Monolingual
Languages:
Portuguese
Availability:
Freely Available
License:
CC BY-SA
Size:
22.4 MByte
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: Measuring the Impact of Readability Features in Fake News Detection
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Roney Santos | Fake.Br Corpus | /N |
Documentation:
https://github.com/roneysco/Fake.br-Corpus
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic English French German Greek Italian Portuguese Russian Spanish
Availability:
Freely Available
License:
CC BY-NC-ND 4.0
Size:
200
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: The Multilingual TEDx Corpus for Speech Recognition and Translation
- Paper track: 12.6 Speech and multimodal resources/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elizabeth Salesky | Multilingual TEDx (mTEDx) | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Bulgarian Croatian Czech French German Mandarin Polish Portuguese Spanish Thai Turkish
Availability:
From Data Center(s)
License:
ELRA
Size:
18.7 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Zero-shot Cross-Lingual Phonetic Recognition with External Language Embedding
- Paper track: 8.11 Cross-lingual and multilingual/accent aspects/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Heting Gao | GlobalPhone | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Arabic Catalan Chinese Dutch Estonian French German Indonesian Italian Japanese Latvian Mongolian Persian Portuguese Russian Slovenian Spanish Swedish Tamil Turkish Welsh
Availability:
Freely Available
License:
CC0
Size:
2880 hours
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST 2 and Massively Multilingual Speech Translation
- Paper track: 12.1 Spoken machine translation/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Juan Pino | CoVoST 2 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Basque Belgian Dutch Croatian Czech Galician Greek Hungarian Portuguese Slovak Slovenian Spanish
Availability:
From Owner
License:
Size:
None
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: An Approach to Online Speaker Change Point Detection Using DNNs and WFSTs
- Paper track: 5.4 Speech and audio segmentation/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Lukas Mateju | COST278 database | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Cantonese English French German Gishu Greek Gujarati Hebrew Hindi Indonesian Japanese Korean Mandarin Persian Portuguese Runyankore Russian Spanish Turkish Vietnamese
Availability:
Freely Available
License:
OpenSource
Size:
22.8 GByte
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Speaking rate, information density, and information rate in first-language and second-language speech
- Paper track: 1.10 Bilingual and L2 acquisition and processing/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ann Bradlow | The ALLSSTAR Corpus | /N |
Documentation:
Documentation in English is available to the public (via the project website)
Language Type:
Multilingual
Languages:
English German Portuguese Russian Turkish
Availability:
Not Available
License:
-
Size:
38000 words
Production Status:
Newly created-in progress
Use:
Discourse
- Paper title: Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Deniz Zeyrek | Middle East Technical University | TR |
| Author 2 | Amália Mendes | Centre for Linguistics of the University of Lisbon | PT |
| Author 3 | Murathan Kurfalı | Middle East Technical University | TR |
| Main Contact | Deniz Zeyrek | Middle East Technical University | None |
Documentation:
An annotation manual in English exists. Currently only available for the annotators.
Written
Corpus,
Language Type:
Multilingual
Languages:
Dutch Portuguese Spanish French Italian
Availability:
From Owner
License:
Creative Commons Attribution-ShareAlike 4.0 International
Size:
18168 entries
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
- Paper title: TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Ben Verhoeven | CLiPS, University of Antwerp | BE |
| Author 2 | Walter Daelemans | University of Antwerp, CLiPS | BE |
| Author 3 | Barbara Plank | University of Copenhagen | DK |
| Main Contact | Barbara Plank | University of Copenhagen | None |
Documentation:
Readme and Technical Report available (English)
Written
Lexicon,
Language Type:
Multilingual
Languages:
Catalan Portuguese Spanish French Italian
Availability:
Freely Available
License:
cc-by
Size:
<Not Specified>
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Marta Villegas | Universitat Pompeu Fabra | ES | | |
| Author 2 | Maite Melero | Universitat Pompeu Fabra | ES | | |
| Author 3 | Núria Bel | Universitat Pompeu Fabra | ES | | |
| Author 4 | Jorge Gracia | Universidad Politécnica de Madrid | ES | | |
| Main Contact | Marta Villegas | Universitat Pompeu Fabra | None | Centro Nacional Investigaciones Oncológicas (CNIO) | None |
Documentation:
http://www.semantic-web-journal.net/content/apertium-bilingual-dictionaries-web-data




